What is nearley?
The nearley npm package is a fast, feature-rich, and modern parser toolkit for JavaScript. It is based on Earley's algorithm and can be used to create parsers for complex, context-free grammars. nearley is designed to be simple to use and extend, making it a good choice for building compilers, interpreters, and other language-related tools.
What are nearley's main functionalities?
Grammar Definition
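This feature allows you to define a grammar for your language. The grammar is written in nearley's own BNF-like syntax, conventionally in a .ne file, and compiled into a JavaScript module with the nearleyc command-line tool (for example, nearleyc grammar.ne -o your-grammar.js).
# grammar.ne: matches the exact string "hello world"
main -> "hello" " " world
world -> "world"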
Parsing Input
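Once you have defined and compiled a grammar, you can create a parser and feed it input to parse. The parser's results property is an array of all possible parses: it holds one entry when the input is unambiguous, several when the grammar is ambiguous, and none when the input is still incomplete.
const nearley = require('nearley');
// Module produced by compiling the grammar with nearleyc.
const grammar = require('./your-grammar.js');
const parser = new nearley.Parser(nearley.Grammar.fromCompiled(grammar));
parser.feed('hello world');
// results is an array with one entry per possible parse.
const results = parser.results;
console.log(results);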
Error Reporting
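nearley provides error reporting features that help you understand where and why a parse failed, which is useful for debugging grammars and providing feedback to users. The feed method throws as soon as it meets input that cannot continue any valid parse. Input that is merely incomplete does not throw; it leaves parser.results empty instead.
const nearley = require('nearley');
const grammar = require('./your-grammar.js');
const parser = new nearley.Parser(nearley.Grammar.fromCompiled(grammar));
try {
  // Assuming the grammar accepts only 'hello world', the final 'd' is
  // unexpected (an 'l' should follow 'hello wor'), so feed() throws.
  parser.feed('hello word');
} catch (error) {
  // The message reports where parsing stopped.
  console.error(error.message);
}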
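When giving feedback to users, it can help to show the failure position in context. The following is a minimal sketch, assuming the failure position is available as a zero-based character offset into the input. The helper name describeOffset and its return shape are hypothetical and not part of nearley.

```javascript
// Hypothetical helper (not part of nearley): convert a zero-based character
// offset into a 1-based line/column pair plus a caret marker for display.
function describeOffset(input, offset) {
  const before = input.slice(0, offset);
  const line = before.split('\n').length;          // 1-based line number
  const lineStart = before.lastIndexOf('\n') + 1;  // index where that line begins
  const column = offset - lineStart + 1;           // 1-based column number
  const lineEnd = input.indexOf('\n', offset);
  const text = input.slice(lineStart, lineEnd === -1 ? input.length : lineEnd);
  // The excerpt is the offending line followed by a caret under the offset.
  return { line, column, excerpt: text + '\n' + ' '.repeat(column - 1) + '^' };
}

const where = describeOffset('hello word', 9);
console.log('line ' + where.line + ', col ' + where.column);
console.log(where.excerpt);
```

The excerpt places the caret under the character at the given offset, which is usually enough for a user to spot the problem.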
Other packages similar to nearley
pegjs
PEG.js is a simple parser generator for JavaScript that produces fast parsers with excellent error reporting. It uses Parsing Expression Grammars (PEG) as the input. Compared to nearley, PEG.js grammars are arguably easier to read and write, but PEGs are less expressive: for example, PEG.js does not support left-recursive rules, and a PEG parser yields at most one parse rather than all parses of an ambiguous input.
chevrotain
Chevrotain is a high-performance parser building toolkit for JavaScript. Unlike nearley, which compiles a grammar file into a parser module, Chevrotain lets you define the grammar directly in JavaScript, so no separate parser generation step is needed. It builds LL(k) recursive-descent parsers, provides a rich feature set, and is particularly well-suited for building complex parsers.
antlr4
ANTLR (ANother Tool for Language Recognition) is a powerful parser generator that can emit parsers in multiple target languages, including JavaScript. ANTLR is more complex than nearley but offers a very rich set of features for building sophisticated language processors. ANTLR 4 uses adaptive LL(*) parsing, also written ALL(*), which is different from nearley's Earley-based approach.
jison
Jison is an npm package that generates bottom-up parsers in JavaScript. Inspired by Bison, it is capable of handling LR and LALR grammars. Jison can be considered more traditional compared to nearley's modern approach, and it might be more familiar to those with experience in classic parser generators.
nearley is a simple, fast and powerful parsing toolkit. It consists of:
- A powerful, modular DSL for describing languages
- An efficient, lightweight Earley parser
- Loads of tools, editor plug-ins, and other goodies!
nearley is a streaming parser with support for catching errors gracefully and
providing all parsings for ambiguous grammars. It is compatible with a variety
of lexers (we recommend moo). It comes with tools for creating tests, railroad
diagrams and fuzzers from your grammars, and has support for a variety of
editors and platforms. It works in both node and the browser.
Unlike most other parser generators, nearley can handle any grammar you can
define in BNF (and more!). In particular, while most existing JS parsers such
as PEG.js and Jison choke on certain grammars (e.g. left-recursive ones),
nearley handles them easily and efficiently by using the Earley parsing
algorithm.
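To illustrate why left recursion is unproblematic for an Earley parser, here is a minimal recognizer sketch. It is not nearley's implementation: the rule format ({ lhs, rhs }), the function name recognize, and the sample grammar are assumptions made for illustration. It only answers yes or no rather than building parse trees, and it does not support rules with an empty right-hand side.

```javascript
// Minimal Earley recognizer (illustrative sketch, not nearley's code).
// Terminals are single characters; any symbol that never appears as an
// lhs is treated as a terminal. Empty right-hand sides are not supported.
function recognize(rules, start, input) {
  const nonterminals = new Set(rules.map((r) => r.lhs));
  // chart[i] holds the items that are alive after reading i characters.
  const chart = Array.from({ length: input.length + 1 }, () => []);
  const seen = chart.map(() => new Set());

  // Add an item to chart[i] unless an identical one is already there.
  function add(i, rule, dot, origin) {
    const key = rules.indexOf(rule) + ':' + dot + ':' + origin;
    if (!seen[i].has(key)) {
      seen[i].add(key);
      chart[i].push({ rule, dot, origin });
    }
  }

  for (const rule of rules) {
    if (rule.lhs === start) add(0, rule, 0, 0);
  }

  for (let i = 0; i <= input.length; i++) {
    // chart[i] grows while it is processed, so iterate by index.
    for (let j = 0; j < chart[i].length; j++) {
      const { rule, dot, origin } = chart[i][j];
      const next = rule.rhs[dot];
      if (next === undefined) {
        // Complete: advance every item that was waiting for rule.lhs.
        for (const item of chart[origin]) {
          if (item.rule.rhs[item.dot] === rule.lhs) {
            add(i, item.rule, item.dot + 1, item.origin);
          }
        }
      } else if (nonterminals.has(next)) {
        // Predict: start every rule for the expected nonterminal here.
        for (const r of rules) {
          if (r.lhs === next) add(i, r, 0, i);
        }
      } else if (i < input.length && input[i] === next) {
        // Scan: the next character matches the expected terminal.
        add(i + 1, rule, dot + 1, origin);
      }
    }
  }

  // Accept if a start rule spans the whole input.
  return chart[input.length].some(
    (item) =>
      item.rule.lhs === start &&
      item.origin === 0 &&
      item.dot === item.rule.rhs.length
  );
}

// A left-recursive sample grammar: sum -> sum "+" num | num
const sampleRules = [
  { lhs: 'sum', rhs: ['sum', '+', 'num'] },
  { lhs: 'sum', rhs: ['num'] },
  { lhs: 'num', rhs: ['1'] },
  { lhs: 'num', rhs: ['2'] },
];

console.log(recognize(sampleRules, 'sum', '1+2+1')); // true
console.log(recognize(sampleRules, 'sum', '1+'));    // false
```

The rule sum -> sum "+" num is left-recursive. A naive recursive-descent parser would recurse forever on it, whereas here the duplicate check in add makes the prediction step terminate.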
nearley is used by a wide variety of projects.
nearley is an npm staff pick.
Documentation
Please visit our website https://nearley.js.org to get started! You will find a
tutorial, detailed reference documents, and links to several real-world
examples to get inspired.
Contributing
Please read this document before working on
nearley. If you are interested in contributing but unsure where to start, take
a look at the issues labeled "up for grabs" on the issue tracker, or message a
maintainer (@kach or @tjvr on GitHub).
nearley is MIT licensed.
A big thanks to Nathan Dinsmore for teaching me how to Earley, Aria Stewart for
helping structure nearley into a mature module, and Robin Windels for
bootstrapping the grammar. Additionally, Jacob Edelman wrote an experimental
JavaScript parser with nearley and contributed ideas for EBNF support. Joshua
T. Corbin refactored the compiler to be much, much prettier. Bojidar Marinov
implemented postprocessors-in-other-languages. Shachar Itzhaky fixed a subtle
bug with nullables.
Citing nearley
If you are citing nearley in academic work, please use the following BibTeX
entry.
@misc{nearley,
author = "Kartik Chandra and Tim Radvan",
title = "{nearley}: a parsing toolkit for {JavaScript}",
year = {2014},
doi = {10.5281/zenodo.3897993},
url = {https://github.com/kach/nearley}
}